Discrete Cepstrum Coefficients as Perceptual Features
Abstract
Cepstrum coefficients are widely used as features for both speech and music. This paper considers discrete cepstrum coefficients, which are computed from sinusoidal peaks in the short-time spectrum. These coefficients are attractive as features for pattern recognition applications because they allow spectra to be represented as points in a multidimensional vector space. A new Mel frequency warping method is proposed that computes the spectral envelope on the Mel scale and, in contrast to current estimation techniques, does not rely on manually set parameters. Furthermore, the robustness and perceptual relevance of the coefficients are studied and improved.
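To make the idea of cepstrum coefficients computed from sinusoidal peaks concrete, the following is a minimal sketch of a regularised least-squares discrete cepstrum fit. It is not the method proposed in the paper. The function names, the ridge weight `ridge`, the common 2595·log10(1 + f/700) Mel formula, and the mapping of the Mel axis onto [0, 0.5] are all assumptions chosen for illustration, and the ridge weight is itself a manually set parameter of the kind the paper aims to avoid.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Common Mel formula; an assumption, not the paper's warping method."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def cosine_basis(freqs_norm, order):
    """One row per peak: [1, 2cos(2*pi*f*1), ..., 2cos(2*pi*f*order)]."""
    i = np.arange(1, order + 1)
    cosines = 2.0 * np.cos(2.0 * np.pi * np.outer(freqs_norm, i))
    return np.hstack([np.ones((len(freqs_norm), 1)), cosines])

def discrete_cepstrum(peak_freqs_hz, peak_amps, sample_rate, order,
                      ridge=1e-4, mel=False):
    """Fit c_0..c_order so the log envelope passes near the log peak amplitudes.

    `ridge` is a hypothetical regularisation weight (assumption); it is not
    applied to c_0, which only sets the overall level.
    """
    f = np.asarray(peak_freqs_hz, dtype=float)
    if mel:
        # Assumed warping: map [0, Nyquist] in Mel onto normalised [0, 0.5].
        f_norm = 0.5 * hz_to_mel(f) / hz_to_mel(sample_rate / 2.0)
    else:
        f_norm = f / sample_rate
    M = cosine_basis(f_norm, order)
    log_a = np.log(np.asarray(peak_amps, dtype=float))
    R = ridge * np.eye(order + 1)
    R[0, 0] = 0.0
    # Regularised normal equations: (M^T M + R) c = M^T log_a
    return np.linalg.solve(M.T @ M + R, M.T @ log_a)

def envelope(ceps, freqs_norm):
    """Evaluate the spectral envelope (linear amplitude) at normalised frequencies."""
    return np.exp(cosine_basis(freqs_norm, len(ceps) - 1) @ ceps)

# Usage: harmonic peaks of a 200 Hz tone sampled at 16 kHz (synthetic data).
sample_rate = 16000
peak_freqs = 200.0 * np.arange(1, 21)
true_ceps = np.array([-1.0, 0.5, -0.2])
peak_amps = envelope(true_ceps, peak_freqs / sample_rate)
est_ceps = discrete_cepstrum(peak_freqs, peak_amps, sample_rate, order=2, ridge=0.0)
mel_ceps = discrete_cepstrum(peak_freqs, peak_amps, sample_rate, order=10, mel=True)
```

Each spectrum is reduced to a fixed-length vector of coefficients, which is what allows it to be treated as a point in a vector space. With `mel=True`, the same fit is carried out on a warped frequency axis, so the coefficients describe the envelope on the Mel scale under the assumed warping.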
Related articles
Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition
Most features used for Cover Song Identification (CSI), for example Pitch Class Profile (PCP) related features, are based on the musical facets shared among cover versions: melody evolution and harmonic progression. In this work, a perceptual feature was studied for CSI. Our idea was to modify the Perceptual Linear Prediction (PLP) model in the field of Automatic Speech Recognition (ASR) by...
Comparison of Speech Features on the Speech Recognition Task
In the present work we review some recently proposed discrete Fourier transform (DFT)- and discrete wavelet packet transform (DWPT)-based speech parameterization methods and evaluate their performance on the speech recognition task. Specifically, in order to assess the practical value of these less studied speech parameterization methods, we evaluate them in a common experimental setup and comp...
Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations
Accuracy of speaker verification is high under controlled conditions but falls off rapidly in the presence of interfering sounds. This is because spectral features, such as Mel-frequency cepstral coefficients (MFCCs), are sensitive to additive noise. MFCCs are a particular realization of warped-frequency representation with low-frequency focus. But there are several alternative, potentially mor...
Spectral Subband Centroids as Complementary Features for Speaker Authentication
Most conventional features used in speaker authentication are based on estimation of spectral envelopes in one way or another, e.g., Mel-scale Filterbank Cepstrum Coefficients (MFCCs), Linear-scale Filterbank Cepstrum Coefficients (LFCCs) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP). In this study, Spectral Subband Centroids (SSCs) are examined. These features are the centroid...
Noise-Robust Speech Features Based on Cepstral Time Coefficients
In this paper, we investigate the noise-robustness of features based on the cepstral time coefficients (CTC). By cepstral time coefficients, we mean the coefficients obtained from applying the discrete cosine transform to the commonly used mel-frequency cepstral coefficients (MFCC). Furthermore, we apply temporal filters used for computing delta and acceleration dynamic features to the CTC, res...
Publication date: 2003